同时翻译系统在处理输入流中的部分源句子时开始产生输出。这些系统需要决定何时读取更多输入以及何时编写输出。这些决定取决于源/目标语言的结构以及部分输入序列中包含的信息。因此,读/写决策策略在不同的输入方式(即语音和文本)中保持不变。这激发了我们利用与语音输入相对应的文本成绩单,以改善同时的语音到文本翻译(Simulst)。我们建议通过同时使用文本到文本翻译(SIMULMT)任务来改善Simulst系统的决策政策,以改善Simulst系统的决策政策。我们还将几种技术从离线语音翻译域扩展,以探索Simulmt任务在改善Simulst性能中的作用。总体而言,我们在不同的延迟制度(ENDE)SIMULST任务的不同延迟制度中取得了34.66% / 4.5 BLEU的改进。
translated by 谷歌翻译
Agile robotics presents a difficult challenge with robots moving at high speeds requiring precise and low-latency sensing and control. Creating agile motion that accomplishes the task at hand while being safe to execute is a key requirement for agile robots to gain human trust. This requires designing new approaches that are flexible and maintain knowledge over world constraints. In this paper, we consider the problem of building a flexible and adaptive controller for a challenging agile mobile manipulation task of hitting ground strokes on a wheelchair tennis robot. We propose and evaluate an extension to work done on learning striking behaviors using a probabilistic movement primitive (ProMP) framework by (1) demonstrating the safe execution of learned primitives on an agile mobile manipulator setup, and (2) proposing an online primitive refinement procedure that utilizes evaluative feedback from humans on the executed trajectories.
translated by 谷歌翻译
Although recent deep learning-based calibration methods can predict extrinsic and intrinsic camera parameters from a single image, their generalization remains limited by the number and distribution of training data samples. The huge computational and space requirement prevents convolutional neural networks (CNNs) from being implemented in resource-constrained environments. This challenge motivated us to learn a CNN gradually, by training new data while maintaining performance on previously learned data. Our approach builds upon a CNN architecture to automatically estimate camera parameters (focal length, pitch, and roll) using different incremental learning strategies to preserve knowledge when updating the network for new data distributions. Precisely, we adapt four common incremental learning, namely: LwF , iCaRL, LU CIR, and BiC by modifying their loss functions to our regression problem. We evaluate on two datasets containing 299008 indoor and outdoor images. Experiment results were significant and indicated which method was better for the camera calibration estimation.
translated by 谷歌翻译
Neglected tropical diseases (NTDs) continue to affect the livelihood of individuals in countries in the Southeast Asia and Western Pacific region. These diseases have been long existing and have caused devastating health problems and economic decline to people in low- and middle-income (developing) countries. An estimated 1.7 billion of the world's population suffer one or more NTDs annually, this puts approximately one in five individuals at risk for NTDs. In addition to health and social impact, NTDs inflict significant financial burden to patients, close relatives, and are responsible for billions of dollars lost in revenue from reduced labor productivity in developing countries alone. There is an urgent need to better improve the control and eradication or elimination efforts towards NTDs. This can be achieved by utilizing machine learning tools to better the surveillance, prediction and detection program, and combat NTDs through the discovery of new therapeutics against these pathogens. This review surveys the current application of machine learning tools for NTDs and the challenges to elevate the state-of-the-art of NTDs surveillance, management, and treatment.
translated by 谷歌翻译
Deep learning approaches for spatio-temporal prediction problems such as crowd-flow prediction assumes data to be of fixed and regular shaped tensor and face challenges of handling irregular, sparse data tensor. This poses limitations in use-case scenarios such as predicting visit counts of individuals' for a given spatial area at a particular temporal resolution using raster/image format representation of the geographical region, since the movement patterns of an individual can be largely restricted and localized to a certain part of the raster. Additionally, current deep-learning approaches for solving such problem doesn't account for the geographical awareness of a region while modelling the spatio-temporal movement patterns of an individual. To address these limitations, there is a need to develop a novel strategy and modeling approach that can handle both sparse, irregular data while incorporating geo-awareness in the model. In this paper, we make use of quadtree as the data structure for representing the image and introduce a novel geo-aware enabled deep learning layer, GA-ConvLSTM that performs the convolution operation based on a novel geo-aware module based on quadtree data structure for incorporating spatial dependencies while maintaining the recurrent mechanism for accounting for temporal dependencies. We present this approach in the context of the problem of predicting spatial behaviors of an individual (e.g., frequent visits to specific locations) through deep-learning based predictive model, GADST-Predict. Experimental results on two GPS based trace data shows that the proposed method is effective in handling frequency visits over different use-cases with considerable high accuracy.
translated by 谷歌翻译
The ability to effectively reuse prior knowledge is a key requirement when building general and flexible Reinforcement Learning (RL) agents. Skill reuse is one of the most common approaches, but current methods have considerable limitations.For example, fine-tuning an existing policy frequently fails, as the policy can degrade rapidly early in training. In a similar vein, distillation of expert behavior can lead to poor results when given sub-optimal experts. We compare several common approaches for skill transfer on multiple domains including changes in task and system dynamics. We identify how existing methods can fail and introduce an alternative approach to mitigate these problems. Our approach learns to sequence existing temporally-extended skills for exploration but learns the final policy directly from the raw experience. This conceptual split enables rapid adaptation and thus efficient data collection but without constraining the final solution.It significantly outperforms many classical methods across a suite of evaluation tasks and we use a broad set of ablations to highlight the importance of differentc omponents of our method.
translated by 谷歌翻译
Disentanglement of constituent factors of a sensory signal is central to perception and cognition and hence is a critical task for future artificial intelligence systems. In this paper, we present a compute engine capable of efficiently factorizing holographic perceptual representations by exploiting the computation-in-superposition capability of brain-inspired hyperdimensional computing and the intrinsic stochasticity associated with analog in-memory computing based on nanoscale memristive devices. Such an iterative in-memory factorizer is shown to solve at least five orders of magnitude larger problems that cannot be solved otherwise, while also significantly lowering the computational time and space complexity. We present a large-scale experimental demonstration of the factorizer by employing two in-memory compute chips based on phase-change memristive devices. The dominant matrix-vector multiply operations are executed at O(1) thus reducing the computational time complexity to merely the number of iterations. Moreover, we experimentally demonstrate the ability to factorize visual perceptual representations reliably and efficiently.
translated by 谷歌翻译
Cement is the most used construction material. The performance of cement hydrate depends on the constituent phases, viz. alite, belite, aluminate, and ferrites present in the cement clinker, both qualitatively and quantitatively. Traditionally, clinker phases are analyzed from optical images relying on a domain expert and simple image processing techniques. However, the non-uniformity of the images, variations in the geometry and size of the phases, and variabilities in the experimental approaches and imaging methods make it challenging to obtain the phases. Here, we present a machine learning (ML) approach to detect clinker microstructure phases automatically. To this extent, we create the first annotated dataset of cement clinker by segmenting alite and belite particles. Further, we use supervised ML methods to train models for identifying alite and belite regions. Specifically, we finetune the image detection and segmentation model Detectron-2 on the cement microstructure to develop a model for detecting the cement phases, namely, Cementron. We demonstrate that Cementron, trained only on literature data, works remarkably well on new images obtained from our experiments, demonstrating its generalizability. We make Cementron available for public use.
translated by 谷歌翻译
现有的自我监督学习策略被限制在有限的目标或主要针对单峰应用程序的通用下游任务。对于复杂性和域亲和力(例如模因分析)而言,这对命令性的多模式应用有了孤立的进展。在这里,我们介绍了两种自我监督的预训练方法,即ext-pie-net和mm-simclr(i)在预训练期间使用现成的多模式仇恨语音数据,并且(ii)执行自我 - 通过合并多个专业借口任务,有效地迎合模因分析所需的复杂多模式表示学习,从而有效地迎合了学习。我们实验不同的自我实验策略,包括可以帮助学习丰富的跨模式表示并使用流行的线性探测来评估可恨模因任务的潜在变体。拟议的解决方案通过标签有效的培训与完全监督的基线竞争,同时在梅诺特挑战的所有三个任务上明显优于他们,分别为0.18%,23.64%和0.93%的绩效增长。此外,我们通过在Harmeme任务上报告竞争性能来证明所提出的解决方案的普遍性。最后,我们通过分析特定于任务的学习,使用更少的标记培训样本来建立学习表现的质量,并争辩说,自主策略和手头下游任务的复杂性是相关的。我们的努力强调了更好的多模式自学方法的要求,涉及有效的微调和可推广性能的专业借口任务。
translated by 谷歌翻译
在社交媒体中发现进攻性语言是社交媒体面临的主要挑战之一。研究人员提出了许多高级方法来完成这项任务。在本报告中,我们尝试利用他们的方法中的学习,并结合我们的想法以改进它们。我们在对进攻推文分类中成功实现了74%的准确性。我们还列出了社交媒体界的滥用内容检测中的即将到来的挑战。
translated by 谷歌翻译